Video saliency prediction using enhanced spatiotemporal alignment network

Authors

Abstract

Due to the variety of motions across different frames, it is highly challenging to learn an effective spatiotemporal representation for accurate video saliency prediction (VSP). To address this issue, we develop a spatiotemporal feature alignment network tailored to VSP, mainly including two key sub-networks: a multi-scale deformable convolutional alignment network (MDAN) and a bidirectional convolutional Long Short-Term Memory (Bi-ConvLSTM) network. The MDAN learns to align the features of neighboring frames to the reference one in a coarse-to-fine manner, which can well handle various motions. Specifically, the MDAN owns a pyramidal hierarchy structure that first leverages deformable convolution (Dconv) to align lower-resolution features, then aggregates the aligned features with higher-resolution features, progressively enhancing the features from top to bottom. The output of the MDAN is fed into the Bi-ConvLSTM for further enhancement, which captures useful long-time temporal information along forward and backward timing directions to effectively guide attention orientation shift under complex scene transformation. Finally, the enhanced features are decoded to generate the predicted saliency map. The proposed model is trained end-to-end without any intricate post-processing. Extensive evaluations on four VSP benchmark datasets demonstrate that our method achieves favorable performance against state-of-the-art methods. The source codes and all results will be released at https://github.com/cj4L/ESAN-VSP.
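The coarse-to-fine deformable alignment described in the abstract can be pictured with the sketch below. It is a minimal illustration, not the authors' released implementation (see the GitHub link above): it assumes a small feature pyramid with a shared channel count, predicts per-level sampling offsets from the concatenated reference/neighbor features, warps the neighbor features with torchvision's DeformConv2d, and reuses the coarser offsets at each finer level. The names PyramidAlign and LevelAlign, and details such as the number of levels and the offset upsampling rule, are hypothetical.

```python
# Minimal sketch of coarse-to-fine deformable feature alignment (assumed design,
# not the authors' code). Requires torch and torchvision.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.ops import DeformConv2d


class LevelAlign(nn.Module):
    """Aligns neighbor-frame features to reference features at one pyramid level."""

    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        pad = kernel_size // 2
        # Predict 2D sampling offsets (x, y per kernel tap) from the
        # concatenated reference/neighbor features.
        self.offset_pred = nn.Conv2d(2 * channels,
                                     2 * kernel_size * kernel_size,
                                     kernel_size, padding=pad)
        self.dconv = DeformConv2d(channels, channels, kernel_size, padding=pad)

    def forward(self, ref, nbr, coarser_offset=None):
        offset = self.offset_pred(torch.cat([ref, nbr], dim=1))
        if coarser_offset is not None:
            # Coarse-to-fine: upsample the coarser-level offsets and add them,
            # doubling their magnitude to match the doubled resolution.
            offset = offset + 2.0 * F.interpolate(
                coarser_offset, scale_factor=2,
                mode="bilinear", align_corners=False)
        aligned = self.dconv(nbr, offset)  # warp neighbor features toward reference
        return aligned, offset


class PyramidAlign(nn.Module):
    """Coarse-to-fine alignment over a feature pyramid (coarsest level first)."""

    def __init__(self, channels: int, num_levels: int = 3):
        super().__init__()
        self.levels = nn.ModuleList([LevelAlign(channels) for _ in range(num_levels)])

    def forward(self, ref_pyr, nbr_pyr):
        # ref_pyr / nbr_pyr: lists of feature maps, ordered coarsest -> finest.
        aligned, offset = None, None
        for align, ref, nbr in zip(self.levels, ref_pyr, nbr_pyr):
            aligned, offset = align(ref, nbr, offset)
        return aligned  # neighbor features aligned at the finest resolution


if __name__ == "__main__":
    # Toy example: 3-level pyramid, coarsest (8x8) to finest (32x32).
    pyr_align = PyramidAlign(channels=64, num_levels=3)
    ref_pyr = [torch.randn(1, 64, s, s) for s in (8, 16, 32)]
    nbr_pyr = [torch.randn(1, 64, s, s) for s in (8, 16, 32)]
    print(pyr_align(ref_pyr, nbr_pyr).shape)  # torch.Size([1, 64, 32, 32])
```

In a full VSP pipeline along the lines of the abstract, the aligned features of each neighboring frame would then be stacked over time and passed through a Bi-ConvLSTM before being decoded into the predicted saliency map.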


Related articles

Spatiotemporal saliency for video classification

Computer vision applications often need to process only a representative part of the visual input rather than the whole image/sequence. Considerable research has been carried out into salient region detection methods based either on models emulating human visual attention (VA) mechanisms or on computational approximations. Most of the proposed methods are bottom-up and their major goal is to fi...


Region-Based Multiscale Spatiotemporal Saliency for Video

Detecting salient objects from a video requires exploiting both spatial and temporal knowledge included in the video. We propose a novel region-based multiscale spatiotemporal saliency detection method for videos, where static features and dynamic features computed from the low and middle levels are combined together. Our method utilizes such combined features spatially over each frame and, at ...


Attention Prediction in Egocentric Video Using Motion and Visual Saliency

We propose a method of predicting human egocentric visual attention using bottom-up visual saliency and egomotion information. Computational models of visual saliency are often employed to predict human attention; however, its mechanism and effectiveness have not been fully explored in egocentric vision. The purpose of our framework is to compute attention maps from an egocentric video that can...


Unsupervised Video Analysis Based on a Spatiotemporal Saliency Detector

Visual saliency, which predicts regions in the field of view that draw the most visual attention, has attracted a lot of interest from researchers. It has already been used in several vision tasks, e.g., image classification, object detection, foreground segmentation. Recently, the spectrum analysis based visual saliency approach has attracted a lot of interest due to its simplicity and good pe...


A spatiotemporal weighted dissimilarity-based method for video saliency detection

Accurately modeling and predicting the visual attention behavior of human viewers can help a video analysis algorithm find interesting regions by reducing the search effort of tasks, such as object detection and recognition. In recent years, a great number and variety of visual attention models for predicting the direction of gaze on images and videos have been proposed. When a human views vide...



Journal

Journal title: Pattern Recognition

Year: 2021

ISSN: 1873-5142, 0031-3203

DOI: https://doi.org/10.1016/j.patcog.2020.107615